
Feat/snowflake ddl sql import solve #789 and #885 schema.description and properties.description on snowflake and postgresql #790

Merged
jschoedl merged 32 commits into datacontract:main from
dmaresma:feat/snowflake_ddl_sql_import
Mar 24, 2026

Conversation

@dmaresma
Contributor

@dmaresma dmaresma commented Jun 11, 2025

Hi, I use Snowflake as my main EDW workspace. I noticed some regressions when simple-ddl-parser was replaced by sqlglot. Unfortunately, Snowflake DDL import is a showstopper for me in migrating the catalog into data contracts.
I took care to keep the sqlglot / SQL Server functionality as is; the SQL Server tests pass.

The regressions affect tags, DDL comments/descriptions, and data types.

  • Tests pass
  • ruff format
  • README.md updated (if relevant)
  • CHANGELOG.md entry added

@simonharrer
Contributor

Can you elaborate on the error with sqlglot? We'd rather fix the issue there than re-introduce the previously removed library.

@dmaresma
Contributor Author

dmaresma commented Jun 13, 2025

I found that `AUTOINCREMENT START # INCREMENT [NOORDER|ORDER]` makes sqlglot fail; a PR has already been sent to the sqlglot team (tobymao/sqlglot#5223).
I also use `${}` syntax as an internal token in my DDL, which sqlglot does not handle well, so I now substitute it out of the SQL automatically (as a kind of favor).
I fixed the table description and column description in the mapping.
I fixed the column tag capture too.
Thanks for your attention.
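
For readers following along, here is a minimal sketch of the `${}` substitution idea described above. It is not the importer's actual code: the helper name `strip_placeholders` and the choice to replace each token with its bare name are assumptions made for illustration.

```python
import re

# Hypothetical pattern: matches tokens such as ${db_name} or ${ schema_name }.
_PLACEHOLDER = re.compile(r"\$\{\s*(\w+)\s*\}")


def strip_placeholders(sql: str) -> str:
    """Replace every ${name} token with the bare identifier `name`."""
    return _PLACEHOLDER.sub(lambda match: match.group(1), sql)


ddl = "CREATE TABLE ${db_name}.${schema_name}.orders (id NUMBER)"
cleaned = strip_placeholders(ddl)
print(cleaned)
```

The cleaned string contains only plain identifiers, so it can be handed to a SQL parser that does not know the `${}` convention.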

@dmaresma
Contributor Author

@simonharrer Can I ask for a review? Comments are now well supported for Snowflake, and simple-ddl-parser is removed.

@dmaresma
Contributor Author

@simonharrer Is it possible to merge this fix for Snowflake SQL DDL import of comments and tags?

@dmaresma dmaresma requested a review from jochenchrist July 14, 2025 13:10
@dmaresma
Contributor Author

I included a fix that maps the T-SQL money data type to decimal, related to #751.

@dmaresma
Contributor Author

dmaresma commented Oct 1, 2025

@simonharrer Please review this PR; it is already 2 months old and solves attribute and table comments.

@dmaresma
Contributor Author

dmaresma commented Jan 26, 2026

I'll soon adapt this PR to the move from DataContractSpecification to ODCS, re-introduce table tags (Snowflake) and table comments (description), and add timestamp as a logical type, as 3.1 allows it.

@dmaresma dmaresma changed the title from "Feat/snowflake ddl sql import solve #789" to "Feat/snowflake ddl sql import solve #789 and #885 schema.description and properties.description on snowflake and postgresql" Jan 28, 2026
@dmaresma dmaresma requested a review from jochenchrist January 28, 2026 15:50
jschoedl and others added 4 commits March 23, 2026 15:49
…dead code

- Fix remove_variable_tokens to correctly substitute all placeholder styles
- Return None instead of string "None" from to_dialect
- Initialize table_description/table_tags to avoid UnboundLocalError
- Remove unused map_physical_type function and format parameter
- Remove duplicate startswith("number") branch in map_type_from_sql

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
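
Two of the fixes listed in this commit message (initializing variables before conditional assignment, and returning `None` rather than the string `"None"`) can be illustrated with a small sketch. The function names, the property keys, and the dialect table below are hypothetical and only stand in for the real importer code.

```python
from typing import Optional


def describe_table(properties: dict) -> tuple:
    # Bind both names up front. Without these two lines, a table that has
    # neither a comment nor tags would raise UnboundLocalError at `return`.
    table_description: Optional[str] = None
    table_tags: Optional[list] = None
    for key, value in properties.items():
        if key == "comment":
            table_description = str(value)
        elif key == "tags":
            table_tags = list(value)
    return table_description, table_tags


def to_dialect_name(server_type: Optional[str]) -> Optional[str]:
    # Illustrative lookup table; the real mapping may differ.
    known = {"snowflake": "snowflake", "postgres": "postgres", "sqlserver": "tsql"}
    if server_type is None:
        return None  # a real None, not str(None) == "None", which is truthy
    return known.get(server_type.lower())
```

Returning a real `None` lets callers use a plain `if dialect is None` check instead of comparing against a magic string.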
…extraction

The previous code called find(Property) on a Tags node, which returned
the Tags node itself (not individual Property children), then relied on
iterating the Tags node yielding its expressions. Use tags.expressions
directly for correctness. Also fixes get_tags return type annotation
(list[str] | None, not str | None) and renames shadowed variable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
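
The behaviour described in this commit message can be reproduced with a toy tree. The classes below are stand-ins written for illustration, not sqlglot's real classes; the sketch assumes, as the message implies, that a `Tags` node is itself a kind of `Property` and that `find` visits the starting node first. The `name=value` output format is also an assumption.

```python
class Node:
    """Minimal stand-in for a parsed-SQL tree node."""

    def __init__(self, *expressions):
        self.expressions = list(expressions)

    def walk(self):
        # Pre-order traversal: the node itself is yielded before its children.
        yield self
        for child in self.expressions:
            yield from child.walk()

    def find(self, node_type):
        # First node in traversal order that is an instance of node_type.
        return next((n for n in self.walk() if isinstance(n, node_type)), None)


class Property(Node):
    def __init__(self, name, value):
        super().__init__()
        self.name, self.value = name, value


class Tags(Property):
    def __init__(self, *expressions):
        Node.__init__(self, *expressions)


def get_tags(tags):
    """Return the tags as a list of strings, or None when there is no Tags node."""
    if tags is None:
        return None
    return [f"{prop.name}={prop.value}" for prop in tags.expressions]


tags = Tags(Property("pii", "true"), Property("owner", "sales"))
found = tags.find(Property)  # matches the Tags node itself, not a child
```

Reading `tags.expressions` directly avoids depending on that accident of traversal order.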
@jschoedl
Collaborator

Hi @dmaresma, I used Claude to refactor sql_importer.py, making it a bit easier to read. I'm not sure about the changed mapping logic yet (map_type_from_sql), but I'll get back to you regarding this later.

map_type_from_sql now returns a tuple so callers can pass format through
to create_property(). Fixes several wrong type mappings:
- binary types (BLOB, RAW, BYTEA, VARBINARY…) → string with format "binary"
- UNIQUEIDENTIFIER → string with format "uuid"
- DATETIME/DATETIME2/DATETIMEOFFSET/SMALLDATETIME → timestamp (was date)
- TIME → time (was string)
- endswith("int") no longer catches POINT

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
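
As a rough illustration of the tuple-returning mapping described in this commit message, the sketch below covers only the cases the message lists. The function name `map_type`, the set contents, and the `(None, None)` result for unmapped types are assumptions, not the actual `map_type_from_sql` implementation.

```python
from typing import Optional

BINARY_TYPES = {"blob", "raw", "bytea", "varbinary"}
TIMESTAMP_TYPES = {"datetime", "datetime2", "datetimeoffset", "smalldatetime"}
INTEGER_TYPES = {"int", "integer", "bigint", "smallint", "tinyint"}


def map_type(sql_type: str) -> tuple:
    """Return (logical type, format); format is None when it does not apply."""
    # Normalise: drop any length/precision suffix such as "(16)".
    base = sql_type.strip().lower().split("(")[0].strip()
    if base in BINARY_TYPES:
        return "string", "binary"
    if base == "uniqueidentifier":
        return "string", "uuid"
    if base in TIMESTAMP_TYPES:
        return "timestamp", None
    if base == "time":
        return "time", None
    # Exact membership instead of endswith("int"), so POINT is not an integer.
    if base in INTEGER_TYPES:
        return "integer", None
    return None, None  # unmapped in this sketch


logical_type, fmt = map_type("VARBINARY(16)")
```

A caller can unpack the pair and pass the second element on as the property's format, which is what allows binary columns to be expressed as a string with format "binary".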
@jschoedl
Collaborator

I used the opportunity to add a format parameter to the type, so that we match the ODCS specification for binary types. LGTM now. Thank you for your contribution!

@jschoedl
Collaborator

By the way, please consider splitting different changes into multiple pull requests in the future. This makes them easier to integrate: long-running PRs with changes in different places take more time to check and are unlikely to be merged in a timely manner.

@jschoedl jschoedl merged commit 9984b5e into datacontract:main Mar 24, 2026
5 checks passed

4 participants